Nonparametric Unsupervised Classification

نویسندگان

  • Yingzhen Yang
  • Thomas S. Huang
چکیده

Unsupervised classification methods learn a discriminative classifier from unlabeled data, which has been proven to be an effective way of simultaneously clustering the data and training a classifier from the data. Various unsupervised classification methods obtain appealing results by the classifiers learned in an unsupervised manner. However, existing methods do not consider the misclassification error of the unsupervised classifiers except unsupervised SVM, so the performance of the unsupervised classifiers is not fully evaluated. In this work, we study the misclassification error of two popular classifiers, i.e. the nearest neighbor classifier (NN) and the plug-in classifier, in the setting of unsupervised classification. The upper bound for the misclassification error of both classifiers involves only pairwise similarity between the data points. We prove that the error of the plug-in classifier is asymptotically bounded by the weighted volume of cluster boundary [1]. Also, the normalized graph Laplacian from the induced similarity kernel recovers different types of transition kernels for Diffusion maps [2,3], which reveals the close relationship between manifold learning and unsupervised classification. We show that with the normalized graph Laplacian, the similarity kernel induced by the misclassification error of the plug-in classifier corresponds to the Fokker-Planck operator; and the similarity kernel induced by the volume of misclassified region by the plug-in classifier correspond to the Laplace-Beltrami operator on the data manifold.

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Nonlinear nonparametric mixed-effects models for unsupervised classification

In this work we propose a novel estimation method for nonlinear nonparametric mixed-effects models, aimed at unsupervised classification. The proposed method is an iterative algorithm that alternates a nonparametric EM step and a nonlinear Maximum Likelihood step. We perform simulation studies in order to evaluate the algorithm performances and we apply this new procedure to a real dataset.

متن کامل

On a Theory of Nonparametric Pairwise Similarity for Clustering: Connecting Clustering to Classification

Pairwise clustering methods partition the data space into clusters by the pairwise similarity between data points. The success of pairwise clustering largely depends on the pairwise similarity function defined over the data points, where kernel similarity is broadly used. In this paper, we present a novel pairwise clustering framework by bridging the gap between clustering and multi-class class...

متن کامل

Complex Models and Computational Methods in Statistics 123

In this work we propose a novel unsupervised classification technique based on the estimation of nonlinear nonparametric mixed-effects models. The proposed method is an iterative algorithm that alternates a nonparametric EM step and a nonlinear Maximum Likelihood step. We apply this new procedure to perform an unsupervised clustering of longitudinal data in two different case studies.

متن کامل

Fully Nonparametric Probability Density Function Estimation with Finite Gaussian Mixture Models

Flexible and reliable probability density estimation is fundamental in unsupervised learning and classification. Finite Gaussian mixture models are commonly used to serve this purpose. However, they fail to estimate unknown probability density functions when used for nonparametric probability density estimation, as severe numerical difficulties may occur when the number of components increases....

متن کامل

Deep Learning with Nonparametric Clustering

Clustering is an essential problem in machine learning and data mining. One vital factor that impacts clustering performance is how to learn or design the data representation (or features). Fortunately, recent advances in deep learning can learn unsupervised features effectively, and have yielded state of the art performance in many classification problems, such as character recognition, object...

متن کامل

Texture Classification Using Nonparametric Markov Random Fields

We present a nonparametric Markov Random Field model for classifying texture in images. This model can capture the characteristics of a wide variety of textures, varying from the highly structured to the stochastic. The power of our modelling technique is evident in that only a small training image is required, even when the training texture contains long range characteristics. We show how this...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2012